Internet Info 1997 December

Internet Message Format | 1997-02-19 | 7KB

Received: (from daemon@localhost) by services.bunyip.com (8.6.10/8.6.9)
    id KAA19591 for urn-ietf-out; Tue, 12 Nov 1996 10:22:20 -0500
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1])
    by services.bunyip.com (8.6.10/8.6.9) with SMTP id KAA19586
    for <urn-ietf@services.bunyip.com>; Tue, 12 Nov 1996 10:22:17 -0500
Received: from windrose.omaha.ne.us by mocha.bunyip.com with SMTP
    (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA01013
    (mail destined for urn-ietf@services.bunyip.com);
    Tue, 12 Nov 96 10:22:11 -0500
Message-Id: <9611121522.AA01013@mocha.bunyip.com>
Received: by privateer.windrose.omaha.ne.us; Tue Nov 12 09:21 CST 1996
From: "Ryan Moats" <jayhawk@ds.internic.net>
To: "Dirk.vanGulik@jrc.it" <Dirk.vanGulik@jrc.it>,
    "Harald.T.Alvestrand@uninett.no" <Harald.T.Alvestrand@uninett.no>
Cc: "FisherM@is3.indy.tce.com" <FisherM@is3.indy.tce.com>,
    "girod@LCS.MIT.EDU" <girod@LCS.MIT.EDU>,
    "mduerst@ifi.unizh.ch" <mduerst@ifi.unizh.ch>,
    "moore@cs.utk.edu" <moore@cs.utk.edu>,
    "tallen@fsc.fujitsu.com" <tallen@fsc.fujitsu.com>,
    "urn-ietf@bunyip.com" <urn-ietf@bunyip.com>
Date: Tue, 12 Nov 96 09:22:43
Priority: Normal
X-Mailer: PMMail 1.52 For OS/2 UNREGISTERED SHAREWARE
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Subject: [URN] Re: Harald's syntax proposals [Somewhat Long]
    (was I18N does not belong in URNs)
Sender: owner-urn-ietf@services.bunyip.com
Precedence: bulk
Reply-To: "Ryan Moats" <jayhawk@ds.internic.net>
Errors-To: owner-urn-ietf@bunyip.com

On Tue, 12 Nov 1996 14:09:34 +0100, Dirk.vanGulik wrote:

>Harald wrote:
>> There are about 3 alternatives I can see:
>
>A
>> - The URN syntax doc says that URNs are sequences of ASCII
>>   characters (or some subset thereof)
>
>B
>> - The URN syntax doc says that URNs are sequences of OCTETS,
>>   with no meaning assigned by the URN syntax doc after the
>>   second :
>
>C
>> - The URN syntax doc says that URNs are sequences of CHARACTERS,
>>   drawn from the ISO 10646 set.
>
>> The tradeoffs are different for the 3 cases.
>
>To extend on this, IMHO very sensible, concept: as it gives room
>to sensible wire definitions:
>
>D. Augmenting, case 'B';
>
>The URN syntax doc says the URNs are sequences of
>OCTETS, whose value comes from a specific range '*' with no meaning
>assigned by the URN syntax doc after the first four octets (which
>are the indexes of the glyphs 'u','r','n' and ':' in charset-XYZ.)
>
>Where the range '*' is; when each octet is taken as an index into
>charset-XYZ, is from the range (say) A-Z, 0-9, '_', '-', ':' and '.'
>
>I'd suggest ISO-8859, US-ASCII, NISO, etc for charset-XYZ.
>
>For each naming scheme an interpretation/casting can be defined for
>the part of the URN after the name-scheme-identifier.
>
>For the 'inet' (or x-dns-2) name-scheme; the suggested interpretation is
>as indexes into 8 bit encoded 7bit ASCII (or ISO-8859-1 or whatever).
>
>Would that be of (some) use?
>
>The reason for _not_ allowing any octet in the string after urn: is to
>avoid entering into the dark realm of having to retrieve MIB like display
>strings whenever you encounter a URN you've not seen before; and which
>you would like to display to a human in such a way he/she can still
>transcribe it. Just showing a list of two digit hex numbers would not do.
>
>Although overloading yet another DNS query type to get the display string
>could be fun on the 'display-string'.name-scheme.urn.net level :-)
>
>Dw.

To me, the extension (as written) has a couple of holes. The URN has
to have information about the namespace so that the resolver can
resolve the URN correctly (it took me a while to realize that the
proper parsing of option B is the second ":"). Therefore, having no
meaning after the optional "urn:" doesn't work.
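
To illustrate the parsing point just made, here is a minimal Python sketch of the option B reading: meaning is assigned only up to the second ":", and everything after it is handed on as uninterpreted octets. The function name `split_urn` and its error handling are assumptions made for this example; they do not come from any draft.

```python
def split_urn(urn: bytes):
    """Split b'urn:<namespace>:<string>' at the first two colons.

    Hypothetical helper: only the 'urn' prefix and the namespace
    identifier are interpreted; the rest is returned as opaque octets.
    """
    parts = urn.split(b":", 2)  # at most three pieces, later colons stay in the NSS
    if len(parts) != 3 or parts[0].lower() != b"urn" or not parts[1]:
        raise ValueError(f"not of the form urn:<namespace>:<string>: {urn!r}")
    _, namespace, nss = parts
    return namespace, nss


if __name__ == "__main__":
    # 'inet' is the name scheme used as an example in the quoted proposal.
    print(split_urn(b"urn:inet:example.com:a"))
```

The resolver can pick a namespace-specific interpretation from the first returned value, which is the information that would be lost if nothing after "urn:" carried meaning.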
The second hole is that while having an interpretation for a specific
namespace defined in a separate namespace document is reasonable (in
fact I don't see how to avoid it), defining general castings in
external documents doesn't strike me as a big win. I still believe we
can keep the syntax doc reasonably short while solving the small
problem in front of us.

This being said, if the holes are plugged up, option D could be a
reasonable alternative.

Now, back to Harald's proposals, I'll try to synopsize them with my
thoughts on the tradeoffs.

>A
>> - The URN syntax doc says that URNs are sequences of ASCII
>>   characters (or some subset thereof)

This is, of course, the simplest. A minimum of transport encoding
would be involved. There may be some reserved characters that would
need to be encoded on a namespace specific basis, but those could be
done with %HH encoding.

>B
>> - The URN syntax doc says that URNs are sequences of OCTETS,
>>   with no meaning assigned by the URN syntax doc after the
>>   second ':'

I've modified the above to point up the ':' character. This is the
"opaque string" option. This option would require some transport
encoding for 8-bit unclean pipes. I see presentation as the major
trade-off here. For 8-bit unfriendly schemes, something like the %HH
scheme would be needed. For others, the %HH scheme could be mixed
with actual glyphs that are representations of that octet (or sequence
of octets) in that presentation scheme. However, THIS IS ONLY FOR THE
CONVENIENCE OF THE PRESENTATION SCHEME, NOT THE USER!!!! The other
difficulty is that reserved characters have to be limited to a subset
of ASCII or there is the possibility for encoding collisions that
would require the syntax doc to specify its own encoding scheme.

>C
>> - The URN syntax doc says that URNs are sequences of CHARACTERS,
>>   drawn from the ISO 10646 set.
>
>> The tradeoffs are different for the 3 cases.

This last one has most of the issues above.
There are probably others that I haven't thought of, mainly because I
don't claim to be an expert in the ins and outs of 7-bit and 8-bit
friendliness.

Having synopsized the plusses and minuses of these, there are some
common issues:

If we restrict the range of octets for the NSS (this is option A and
option D) then a lot of the transport encoding, reserved character
encoding and related issues go away.

If we allow any octet (option B and C) then transport encoding becomes
a minor issue and reserved character encoding must be considered.

Everything else (to me right now at least) is related to presentation
and meta-interpretation (by humans) of the NSS. I'm not sure if these
should intrude into the syntax doc.
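
The contrast between a restricted octet range and an "any octet" NSS can be sketched in Python as follows. This is only an illustration under stated assumptions: `ALLOWED` copies the "(say)" range from the quoted option D (A-Z, 0-9, '_', '-', ':' and '.'), read as US-ASCII, and the names `in_restricted_range`, `hh_encode` and `hh_decode` are hypothetical, not taken from any syntax draft.

```python
import string

# Assumption: the illustrative range from the quoted option D, as US-ASCII octets.
ALLOWED = frozenset(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-:.")


def in_restricted_range(nss: bytes) -> bool:
    """True if every octet is in the restricted range (no encoding needed)."""
    return all(octet in ALLOWED for octet in nss)


def hh_encode(nss: bytes) -> str:
    """Write arbitrary octets as 7-bit text, using %HH for anything outside ALLOWED."""
    return "".join(chr(o) if o in ALLOWED else f"%{o:02X}" for o in nss)


def hh_decode(text: str) -> bytes:
    """Reverse hh_encode; rejects malformed %HH escapes."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            pair = text[i + 1:i + 3]
            if len(pair) != 2 or any(c not in string.hexdigits for c in pair):
                raise ValueError(f"bad %HH escape at position {i}")
            out.append(int(pair, 16))
            i += 3
        else:
            out.append(ord(text[i]))  # raises ValueError for code points above 255
            i += 1
    return bytes(out)


if __name__ == "__main__":
    raw = b"A\xe9:B"                  # contains one 8-bit octet
    print(in_restricted_range(raw))   # False, so encoding is required
    print(hh_encode(raw))             # A%E9:B
```

Because "%" itself is outside `ALLOWED`, it is always escaped, so decoding is unambiguous. With this particular range even lowercase letters are escaped, which shows how much the presentation depends on where the range is drawn.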